Optimizing warfarin dosing for patients with atrial fibrillation using machine learning.

Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada. petchj@hhsc.ca. Population Health Research Institute, Hamilton, ON, Canada. Division of Cardiology, Department of Medicine, McMaster University, Hamilton, ON, Canada. Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, ON, Canada. Centre for Data Science and Digital Health, Hamilton Health Sciences, Hamilton, ON, Canada. Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada. Department of Computer Science, University of Toronto, Toronto, ON, Canada. Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA. Institute for Medical and Evaluative Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA. Vector Institute, Toronto, ON, Canada. Population Health Research Institute, Hamilton, ON, Canada. Department of Cardiology, University Medical Center, Johannes Gutenberg University Mainz, Mainz, Germany. Microsoft Research, Montreal, QC, Canada. Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada. Division of Cardiology, Department of Medicine, Duke University Medical Center, Durham, NC, USA. Duke Clinical Research Institute, Duke University, Durham, NC, USA. Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA. Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA. Department of Medical Sciences, Cardiology, Uppsala University, Uppsala, Sweden. Uppsala Clinical Research Center, Uppsala University, Uppsala, Sweden. Division of Hematology and Thromboembolism, Department of Medicine, McMaster University, Hamilton, ON, Canada. Division of Cardiology, Department of Medicine, McMaster University, Hamilton, ON, Canada.

Scientific reports. 2024;(1):4516

Abstract

While novel oral anticoagulants are increasingly used to reduce the risk of stroke in patients with atrial fibrillation, vitamin K antagonists such as warfarin continue to be used extensively for stroke prevention across the world. Although effective in reducing the risk of stroke, warfarin has complex pharmacodynamics that make it difficult to use clinically, and many patients experience under- and/or over-anticoagulation. In this study we employed a novel implementation of deep reinforcement learning to provide clinical decision support to optimize time in therapeutic International Normalized Ratio (INR) range. We used a novel semi-Markov decision process formulation of the Batch-Constrained deep Q-learning algorithm to develop a reinforcement learning model that dynamically recommends warfarin dosing to achieve an INR of 2.0-3.0 for patients with atrial fibrillation. The model was developed using data from 22,502 patients in the warfarin-treated groups of the pivotal randomized clinical trials of edoxaban (ENGAGE AF-TIMI 48), apixaban (ARISTOTLE) and rivaroxaban (ROCKET AF). The model was externally validated on data from 5,730 warfarin-treated patients in a fourth trial, of dabigatran (RE-LY), using multilevel regression models to estimate the relationship between center-level algorithm-consistent dosing, time in therapeutic INR range (TTR), and a composite clinical outcome of stroke, systemic embolism or major hemorrhage. External validation showed a positive association between center-level algorithm-consistent dosing and TTR (R² = 0.56). Each 10% increase in algorithm-consistent dosing at the center level independently predicted a 6.78% improvement in TTR (95% CI 6.29, 7.28; p < 0.001) and an 11% decrease in the composite clinical outcome (HR 0.89; 95% CI 0.81, 1.00; p = 0.015).
These results were comparable to those of a rules-based clinical algorithm used for benchmarking, for which each 10% increase in algorithm-consistent dosing independently predicted a 6.10% increase in TTR (95% CI 5.67, 6.54; p < 0.001) and a 10% decrease in the composite outcome (HR 0.90; 95% CI 0.83, 0.98; p = 0.018). Our findings suggest that a deep reinforcement learning algorithm can optimize time in therapeutic range for patients taking warfarin. A digital clinical decision support system to promote algorithm-consistent warfarin dosing could optimize time in therapeutic range and improve clinical outcomes in atrial fibrillation globally.
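
Illustrative note on TTR. The abstract does not state how time in therapeutic range was computed. A common approach is to interpolate INR linearly between consecutive measurements and count the share of follow-up days on which the interpolated value falls within 2.0-3.0. The sketch below assumes that approach; the function name, the input layout and the example values are hypothetical and are not taken from the study.

```python
from datetime import date


def time_in_therapeutic_range(measurements, low=2.0, high=3.0):
    """Fraction of follow-up days with interpolated INR within [low, high].

    `measurements` is a list of (date, inr) pairs. INR is assumed to change
    linearly between consecutive measurements (an assumption of this sketch,
    not a detail given in the abstract).
    """
    ordered = sorted(measurements)
    in_range_days = 0
    total_days = 0
    for (d0, inr0), (d1, inr1) in zip(ordered, ordered[1:]):
        span = (d1 - d0).days
        if span <= 0:
            continue  # skip same-day duplicate measurements
        for k in range(span):
            # Interpolated INR on day k of this interval (day 0 is d0).
            inr = inr0 + (inr1 - inr0) * k / span
            if low <= inr <= high:
                in_range_days += 1
        total_days += span
    if total_days == 0:
        raise ValueError("need at least two measurements on different days")
    return in_range_days / total_days


# Hypothetical patient: sub-therapeutic at first, then stable in range.
example = [
    (date(2024, 1, 1), 1.5),
    (date(2024, 1, 11), 2.5),
    (date(2024, 1, 21), 2.5),
]
print(time_in_therapeutic_range(example))
```

In this made-up example the interpolated INR reaches 2.0 on day 5 of the first interval, so 5 of the first 10 days and all 10 of the following days count as in range.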